Speaker Identification using FM Features
Abstract
The AM-FM modulation model of speech is a nonlinear model that has been used successfully in several branches of speech-related research. However, the significance of the AM-FM features extracted from this model has not been fully explored in applications such as speaker identification. This paper shows that frequency modulation (FM) features can improve speaker identification accuracy. Because the amplitude modulation (AM) feature is similar to the conventional Mel frequency cepstral coefficients (MFCC), this paper focuses mainly on the FM feature. The correlation between FM feature components is shown to be very small compared with that of Mel filterbank log energies, which reduces the need for decorrelation. FM feature components are also shown to be very nearly Gaussian distributed. Further, speech is synthesized from AM-FM features to compare four existing AM-FM demodulation methods on the perceptual quality of the synthesized speech. Of these, the Digital Energy Separation Algorithm (DESA) gives the best synthesized speech and is therefore used as the front-end of our speaker identification system. Evaluation of speaker identification using FM features on the NIST 2001 database shows a relative improvement in identification accuracy of 2% for male speakers and 9% for female speakers over the conventional MFCC-based front-end.
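To illustrate the kind of demodulation an energy separation front-end performs, the following is a minimal sketch of the DESA-2 variant built on the Teager-Kaiser energy operator. The abstract does not say which DESA variant, filterbank, or smoothing the paper uses, so the variant choice, the function names `teager_energy` and `desa2`, and the test tone parameters are all assumptions made for illustration only.

```python
import numpy as np

def teager_energy(x):
    """Teager-Kaiser energy: psi[x](n) = x(n)^2 - x(n-1) * x(n+1)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def desa2(x, fs):
    """Sketch of DESA-2 (assumed variant, not confirmed by the abstract).

    Returns (amplitude, frequency_hz) estimates for samples 2..N-3 of x.
    Valid only when the instantaneous frequency is below fs / 4.
    """
    x = np.asarray(x, dtype=float)
    psi_x = teager_energy(x)[1:-1]   # energy of x, aligned to samples 2..N-3
    y = x[2:] - x[:-2]               # symmetric difference x(n+1) - x(n-1)
    psi_y = teager_energy(y)         # energy of y, samples 2..N-3

    eps = np.finfo(float).tiny       # guard against division by zero
    cos_arg = 1.0 - psi_y / (2.0 * np.maximum(psi_x, eps))
    omega = 0.5 * np.arccos(np.clip(cos_arg, -1.0, 1.0))  # rad/sample
    freq_hz = omega * fs / (2.0 * np.pi)
    amp = 2.0 * psi_x / np.sqrt(np.maximum(psi_y, eps))
    return amp, freq_hz

# Usage on a synthetic tone (hypothetical parameters, not from the paper).
fs = 8000.0
n = np.arange(400)
tone = 0.7 * np.cos(2.0 * np.pi * 500.0 * n / fs + 0.3)
amp, freq_hz = desa2(tone, fs)
print(freq_hz.mean(), amp.mean())
```

On real speech the operator would normally be applied to each bandpass-filtered subband rather than to the full-band signal, and the raw estimates are usually smoothed before being used as features.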
Similar resources
AM-FM Based Robust Speaker Identification in Babble Noise
Speech babble is one of the most challenging types of noise interference for speech and speaker recognition systems because of its speaker- and speech-like characteristics. The performance of such systems degrades strongly in the presence of background noise such as babble. Existing techniques address this problem by additional processing of the speech signal to remove the noise. In contrast to existing works, the aim...
FM features for automatic forensic speaker recognition
Frequency modulation (FM) information from the speech signal is herein proposed to complement the conventional amplitude based features for automatic forensic speaker recognition systems. In addition to presenting the AM-FM model of speech used to generate the proposed frequency modulation features, the significance of frequency modulation for speaker recognition is discussed. Evaluation result...
Analysis of band structures for speaker-specific information in FM feature extraction
Frequency modulation (FM) features are typically extracted using a filterbank, usually based on an auditory frequency scale. However, there is psychophysical evidence to suggest that this scale may not be optimal for extracting speaker-specific information. In this paper, speaker-specific information in FM features is analyzed as a function of the filterbank structure at the feature, model and cl...
Investigation of Spectral Centroid Magnitude and Frequency for Speaker Recognition
Most conventional features used in speaker recognition are based on spectral envelope characterizations such as Mel-scale filterbank cepstrum coefficients (MFCC), Linear Prediction Cepstrum Coefficient (LPCC) and Perceptual Linear Prediction (PLP). The MFCC’s success has seen it become a de facto standard feature for speaker recognition. Alternative features, that convey information other than ...
Phonetic Speaker Id
This paper describes the exploration of text-independent speaker identification using novel approaches based on speakers’ phonetic features instead of traditional acoustic features. Different phonetic speaker identification approaches are discussed in this paper and evaluated using two speaker identification systems: one multilingual system and one single language multiple-engine system. Furthe...
Publication date: 2006